Engineering posts about Deep Learning

Curated summaries and key learnings for engineers working with Deep Learning.

Making User-Sequence Data More Cost-Efficient, Faster, and Easier to Use

This article discusses the redesign of a user-sequence platform aimed at improving the efficiency, speed, and usability of user data for machine learning applications. It addresses the challenges...

Salesforce

How Salesforce Built an AI Security Agent for Autonomous Threat Triage

The article outlines how Salesforce developed the SATA agent, an AI-driven system designed to enhance cybersecurity by autonomously triaging threats across complex environments. It highlights the...

Google

Google Tensor SDK Beta with LiteRT

The Google Tensor ML SDK has transitioned from an Experimental Access Program to Beta, enabling developers to leverage the capabilities of the Google Tensor System-on-Chip (SoC) and its dedicated...

Salesforce

Creating a Multi-Tenant AI Agent Platform Handling 7K+ Sessions Without Cross-Team Interference

The article outlines the development of Bring Your Own Planner (BYOP), a multi-tenant AI agent platform designed to enhance team autonomy and scalability within Salesforce. It addresses the...

Meta (Facebook)

Reel Friends: Building Social Discovery that Scales to Billions

In the Meta Tech Podcast episode featuring Pascal Hartig, the engineering intricacies behind the 'Friend Bubbles' feature of Facebook Reels are explored. The discussion highlights the evolution of...

Enhancing Ad Relevance: Integrating Real-Time Context into Sequential Recommender Models

The article presents a novel approach to enhancing ad relevance by integrating real-time context into sequential recommender models. It highlights the limitations of previous models that relied...

Databricks

Pushing the Frontier for Data Agents with Genie

The article presents Genie, a sophisticated data agent developed by Databricks, designed to enhance the analysis of both structured and unstructured enterprise data. It highlights the challenges...

Databricks

11m

Addressing HR's widening capacity gap with AI

The article outlines the pressing challenges faced by HR departments in the wake of increasing demands and limited resources, highlighting the widening capacity gap exacerbated by post-pandemic...

Databricks

How Superhuman and Databricks built a 200K QPS inference platform together

The article describes the collaboration between Superhuman and Databricks in developing a high-performance inference platform capable of handling over 200,000 queries per second (QPS) with stringent...

Apple

Text-Conditional JEPA for Learning Semantically Rich Visual Representations

The article introduces Text-Conditional JEPA (TC-JEPA), a new framework for learning semantically rich visual representations by leveraging image captions to modulate predicted features. This...

Apple

What Matters in Practical Learned Image Compression

The article presents a comprehensive study on learned image compression codecs, emphasizing their optimization for the human visual system. It highlights the development of a new codec that...

Apple

From Where Things Are to What They’re For: Benchmarking Spatial–Functional Intelligence for Multimodal LLMs

The paper introduces the Spatial-Functional Intelligence Benchmark (SFI-Bench), aimed at evaluating the advanced reasoning capabilities of multimodal large language models (MLLMs). It highlights the...

Apple

Normalizing Flows with Iterative Denoising

The article presents advancements in Normalizing Flows (NFs) through the introduction of iterative TARFlow (iTARFlow), a generative model that combines autoregressive generation with iterative...

Apple

SpecMD: A Comprehensive Study on Speculative Expert Prefetching

The article presents SpecMD, a standardized framework designed for benchmarking caching strategies in Mixture-of-Experts (MoE) models. It highlights the importance of an expert caching mechanism to...

Apple

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

The article discusses a novel approach to Key-Value (KV) caching in transformer language models, focusing on reducing memory footprint while maintaining high throughput during autoregressive...

Databricks

28m

Generative AI for Business: A Complete Strategy and Implementation Guide

The article discusses the transformative potential of generative AI in business, highlighting its ability to create significant economic value across various sectors. It emphasizes the importance of...

Databricks

11m

LLM Vs AI: A Practical Guide to Differences, Use Cases, and Tools

This article serves as a comprehensive guide to understanding the distinctions between large language models (LLMs) and the broader field of artificial intelligence (AI). It outlines the scope, core...

Apple

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning

The article introduces PORTool, an importance-aware policy optimization algorithm designed for multi-tool-integrated reasoning in agents powered by large language models (LLMs). It addresses the...

Google

12m

Supercharging LLM inference on Google TPUs: Achieving 3X speedups with diffusion-style speculative decoding

The article discusses accelerating large language model (LLM) inference with block diffusion speculative decoding, specifically the DFlash method, on Google...

Salesforce

How AI-Driven Kubernetes Optimization Reclaimed Millions from 47% Idle Capacity

The article discusses Salesforce's challenges with infrastructure scaling on its Hyperforce platform, particularly regarding over-provisioning and idle capacity in Kubernetes services. It introduces...